feat: initial scaffold and core implementation of agent-kernel #1

Merged
dgenio merged 19 commits into main from copilot/init-agent-kernel-implementation on Mar 4, 2026

Copilot AI commented Mar 2, 2026

Implements agent-kernel from scratch — a capability-based security kernel for AI agents operating in large tool ecosystems (1000+ tools via MCP, A2A, internal APIs). Provides the authorization, execution, and audit layer sitting above raw tool execution and below the LLM context window.

Package structure (src/ layout, Python ≥ 3.10, Apache-2.0)

  • enums.py: SafetyClass (READ/WRITE/DESTRUCTIVE), SensitivityTag (PII/PCI/SECRETS/NONE)
  • errors.py: 10-class exception hierarchy; no bare ValueError/KeyError anywhere
  • models.py: core dataclasses (Capability, Principal, Frame, Handle, ActionTrace, Budgets, etc.)
  • registry.py: CapabilityRegistry with deterministic keyword-overlap search (no LLM, no vector DB)
  • tokens.py: CapabilityToken + HMACTokenProvider (HMAC-SHA256); tokens bind principal_id + capability_id + constraints for confused-deputy prevention
  • policy.py: DefaultPolicyEngine. READ is always allowed; WRITE requires a justification of ≥ 15 chars plus the writer or admin role; DESTRUCTIVE requires admin; PII/PCI enforces tenant attribute + allowed_fields; max_rows is capped at 50 (user) / 500 (service)
  • router.py: StaticRouter with ordered fallback driver chains
  • drivers/: InMemoryDriver (Python callables + 200-record deterministic billing dataset), HTTPDriver (httpx async)
  • firewall/: Firewall transforms RawResult → Frame; four response modes (summary/table/handle_only/raw); enforces Budgets; regex-based PII/PCI redaction; deterministic summarisation
  • handles.py: HandleStore with TTL, lazy eviction, pagination (offset/limit), field selection, equality filtering
  • trace.py / kernel.py: TraceStore + Kernel main entry point wiring all components
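
The policy rules listed for policy.py can be illustrated with a small standalone sketch. This is not the PR's DefaultPolicyEngine: the `Decision` type, the `decide` function, and the `is_service` flag used to pick the row cap are assumptions made for illustration, and the PII/PCI tenant checks are left out.

```python
from dataclasses import dataclass, field
from enum import Enum


class SafetyClass(Enum):
    READ = "read"
    WRITE = "write"
    DESTRUCTIVE = "destructive"


@dataclass
class Principal:
    principal_id: str
    roles: set[str] = field(default_factory=set)
    is_service: bool = False  # assumption: how user and service principals differ


@dataclass
class Decision:  # hypothetical result type for this sketch
    allowed: bool
    reason: str
    max_rows: int = 0


def decide(principal: Principal, safety: SafetyClass, justification: str = "") -> Decision:
    """Apply the READ / WRITE / DESTRUCTIVE rules from the PR description."""
    cap = 500 if principal.is_service else 50  # max_rows cap: 50 (user) / 500 (service)
    if safety is SafetyClass.READ:
        return Decision(True, "read is always allowed", cap)
    if safety is SafetyClass.WRITE:
        if len(justification) < 15:
            return Decision(False, "justification must be at least 15 characters")
        if not principal.roles & {"writer", "admin"}:
            return Decision(False, "write requires the writer or admin role")
        return Decision(True, "write allowed", cap)
    # Remaining case: DESTRUCTIVE
    if "admin" not in principal.roles:
        return Decision(False, "destructive requires the admin role")
    return Decision(True, "destructive allowed", cap)
```

Returning a reason string with every decision is one way to keep denials explainable in an audit record; the real engine may structure this differently.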

Quickstart

kernel = Kernel(registry, router=StaticRouter(routes={"tasks.list": ["memory"]}))
kernel.register_driver(driver)

token = kernel.get_token(CapabilityRequest("tasks.list", goal="list tasks"), principal, justification="")
frame = await kernel.invoke(token, principal=principal, args={})
# frame.facts  →  ['Total rows: 20', 'Top keys: id, title, done', ...]
# frame.handle →  Handle(handle_id='...', total_rows=20, ...)

expanded = kernel.expand(frame.handle, query={"limit": 3, "fields": ["id", "title"]})
trace = kernel.explain(frame.action_id)   # full audit record

Testing & tooling

  • 107 pytest tests, 94% coverage across all modules
  • pyproject.toml (PEP 621, hatchling), Makefile (fmt/lint/type/test/example/ci)
  • GitHub Actions CI matrix: Python 3.10 / 3.11 / 3.12 with explicit permissions: contents: read
  • Three self-contained examples (no internet): basic_cli.py, billing_demo.py, http_driver_demo.py
  • Docs: architecture.md, security.md, integrations.md, capabilities.md, context_firewall.md
Original prompt

Create the initial scaffold and core implementation for agent-kernel, a Python library that implements a capability-based security kernel for AI agents operating in large tool ecosystems (1000+ tools via MCP, A2A, internal APIs).

This library sits ABOVE contextweaver (a context compilation library, available as a dependency) and provides the authorization, execution, and audit layer.

What this library does

  1. Capability Registry: register task-shaped capabilities (not raw tools) with safety classes and sensitivity tags.
  2. Capability Tokens: HMAC-signed, time-bounded, principal-scoped tokens that authorize specific actions.
  3. Policy Engine: role-based access control with confused-deputy prevention. READ/WRITE/DESTRUCTIVE safety classes, PII/PCI sensitivity handling.
  4. Drivers: pluggable execution layer (InMemoryDriver for testing, HTTPDriver for real APIs, protocol-agnostic MCP adapter interface).
  5. Context Firewall: transforms raw tool output into budgeted Frames (facts + table preview + handles). Never exposes raw output to the LLM by default.
  6. Audit Trail: every action is traced and explainable via kernel.explain(action_id).
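
The capability-token idea in item 2 (HMAC-signed, time-bounded, principal-scoped) can be sketched with the standard library alone. The token layout, the `issue_token` and `verify_token` functions, and the local exception classes below are assumptions for illustration; they borrow names from the error list in this PR but are not the HMACTokenProvider implementation.

```python
import base64
import hashlib
import hmac
import json
import time


class TokenInvalid(Exception): ...
class TokenExpired(Exception): ...
class TokenScopeError(Exception): ...


def _sign(secret: bytes, payload: bytes) -> str:
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def issue_token(secret, principal_id, capability_id, constraints, ttl_seconds=300, now=None):
    """Serialize the claims and append an HMAC-SHA256 signature."""
    issued_at = time.time() if now is None else now
    claims = {
        "principal_id": principal_id,
        "capability_id": capability_id,
        "constraints": constraints,
        "expires_at": issued_at + ttl_seconds,
    }
    payload = json.dumps(claims, sort_keys=True, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(payload).decode() + "." + _sign(secret, payload)


def verify_token(secret, token, principal_id, capability_id, now=None):
    """Check signature, expiry, and that the caller matches the bound scope."""
    encoded, _, signature = token.partition(".")
    try:
        payload = base64.urlsafe_b64decode(encoded.encode())
    except ValueError as exc:  # binascii.Error is a ValueError subclass
        raise TokenInvalid("malformed token") from exc
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(signature.encode(), _sign(secret, payload).encode()):
        raise TokenInvalid("signature mismatch")
    claims = json.loads(payload)
    current = time.time() if now is None else now
    if current >= claims["expires_at"]:
        raise TokenExpired("token expired")
    # Confused-deputy check: the token only works for the principal and
    # capability it was issued for.
    if claims["principal_id"] != principal_id or claims["capability_id"] != capability_id:
        raise TokenScopeError("token used outside its scope")
    return claims
```

Binding the principal and capability into the signed payload means a token leaked to another agent cannot be replayed under a different identity without failing verification.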

Package details

  • Package name: agent_kernel
  • Python >= 3.10
  • pyproject.toml with PEP 621, src/ layout
  • License: Apache-2.0
  • Runtime deps: httpx (for HTTPDriver)
  • Dev deps: pytest, pytest-cov, pytest-asyncio, ruff, mypy
  • [tool.pytest.ini_options] asyncio_mode = "auto"

Repository structure

agent-kernel/
├── pyproject.toml
├── Makefile                    # fmt, lint, type, test, example, ci
├── LICENSE                     # Apache-2.0
├── README.md
├── CHANGELOG.md
├── CONTRIBUTING.md
├── AGENTS.md                   # AI agent instructions for working in this repo
├── .gitignore
├── .github/workflows/ci.yml   # Python 3.10, 3.11, 3.12: ruff + mypy + pytest
├── docs/
│   ├── architecture.md         # Component deep-dive + Mermaid diagram
│   ├── security.md             # Threat model, confused deputy, token scopes
│   ├── integrations.md         # MCP integration, custom drivers, capability mapping
│   ├── capabilities.md         # Designing good capabilities, naming conventions
│   └── context_firewall.md     # Budgets, frames, handles, redaction, expand
├── examples/
│   ├── basic_cli.py            # Full flow: request → grant → invoke → expand
│   ├── billing_demo.py         # InMemoryDriver with dataset, budgets, handles, pagination
│   └── http_driver_demo.py     # Local mini HTTP server + HTTPDriver (no internet needed)
├── src/
│   └── agent_kernel/
│       ├── __init__.py         # Public API exports + __version__
│       ├── py.typed            # PEP 561
│       ├── models.py           # Core dataclasses: Capability, CapabilityRequest, CapabilityGrant,
│       │                       #   Principal, PolicyDecision, RoutePlan, ImplementationRef,
│       │                       #   RawResult, Frame, Handle, Provenance, ActionTrace,
│       │                       #   Budgets, FieldSpec, ResponseMode
│       ├── enums.py            # SafetyClass (READ/WRITE/DESTRUCTIVE),
│       │                       #   SensitivityTag (PII/PCI/SECRETS/NONE)
│       ├── errors.py           # AgentKernelError, TokenExpired, TokenInvalid, TokenScopeError,
│       │                       #   PolicyDenied, DriverError, FirewallError, CapabilityNotFound,
│       │                       #   HandleNotFound, HandleExpired
│       ├── registry.py         # CapabilityRegistry: register, lookup, keyword-based request matching
│       ├── policy.py           # PolicyEngine protocol + DefaultPolicyEngine (rule-based):
│       │                       #   READ allowed, WRITE needs justification+role, DESTRUCTIVE needs admin,
│       │                       #   PII/PCI requires tenant attribute, max_rows enforcement
│       ├── tokens.py           # CapabilityToken dataclass, TokenProvider protocol,
│       │                       #   HMACTokenProvider (SHA-256, env secret, expiry, signature verify)
│       ├── router.py           # Router protocol + StaticRouter (first match + fallback)
│       ├── drivers/
│       │   ├── __init__.py
│       │   ├── base.py         # Driver protocol, ExecutionContext, RawResult
│       │   ├── memory.py       # InMemoryDriver (simulated capabilities with Python functions)
│       │   └── http.py         # HTTPDriver (httpx-based, timeouts, error mapping)
│       ├── firewall/
│       │   ├── __init__.py
│       │   ├── budgets.py      # Budgets dataclass (max_rows, max_fields, max_chars, max_depth)
│       │   ├── transform.py    # Firewall class: RawResult → Frame with budget enforcement
│       │   ├── redaction.py    # PII/PCI field redaction (email, phone, card_number, ssn)
│       │   └── summarize.py    # Deterministic summarization heuristics (no LLM):
│       │                       #   list-of-dicts → count+stats+top_keys,
│       │                       #   dict → keys+aggregates, str...

</details>
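
The deterministic summarisation described for summarize.py (list-of-dicts → count + top keys, no LLM) might look roughly like the sketch below. The `summarize_rows` function and its `top_n` parameter are hypothetical; only the shape of the output facts follows the quickstart in the PR description.

```python
from collections import Counter
from typing import Any


def summarize_rows(rows: list[dict[str, Any]], top_n: int = 3) -> list[str]:
    """Produce short, deterministic facts about a list of dict rows."""
    facts = [f"Total rows: {len(rows)}"]
    # Count how often each key appears; ties keep first-seen order, so the
    # same input always yields the same output.
    key_counts = Counter(key for row in rows for key in row)
    top_keys = [key for key, _ in key_counts.most_common(top_n)]
    if top_keys:
        facts.append("Top keys: " + ", ".join(top_keys))
    return facts


# Illustrative rows shaped like the quickstart's task list.
sample = [{"id": i, "title": f"task {i}", "done": False} for i in range(20)]
facts = summarize_rows(sample)
```

Because the facts are derived by counting rather than by a model call, the same raw result always produces the same Frame content, which keeps audit traces reproducible.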




Copilot AI and others added 2 commits March 2, 2026 20:01
Co-authored-by: dgenio <12731907+dgenio@users.noreply.github.com>
Co-authored-by: dgenio <12731907+dgenio@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Create initial scaffold for agent-kernel library" to "feat: initial scaffold and core implementation of agent-kernel" on Mar 2, 2026
dgenio requested a review from Copilot on March 3, 2026 23:25

Copilot AI left a comment


Pull request overview

Initial implementation of the agent-kernel library: a capability-based authorization + execution kernel for agents, including policy gating, HMAC-signed capability tokens, driver routing/execution, a context firewall (Frame/Handle), and an audit trail.

Changes:

  • Added core runtime modules (models, registry, policy, tokens, router, kernel, handle/trace stores, drivers, firewall).
  • Added a full pytest suite plus fixtures to validate end-to-end flows and security properties.
  • Added packaging/tooling, CI workflow, docs, and runnable examples.
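
The driver routing with ordered fallback chains mentioned in the PR description can be sketched as a loop over candidate drivers. Everything here is an assumption for illustration: drivers are modelled as plain async callables, and `invoke_with_fallback`, `flaky_http`, and `memory` are hypothetical names, not the Kernel or StaticRouter API.

```python
import asyncio
from typing import Any, Awaitable, Callable


class DriverError(Exception):
    """Raised by a driver that cannot serve a request (local stand-in)."""


Driver = Callable[[str, dict[str, Any]], Awaitable[Any]]


async def invoke_with_fallback(
    routes: dict[str, list[str]],
    drivers: dict[str, Driver],
    capability_id: str,
    args: dict[str, Any],
) -> tuple[str, Any]:
    """Try each routed driver in order; return the first successful result."""
    failures: list[str] = []
    for driver_id in routes.get(capability_id, []):
        driver = drivers.get(driver_id)
        if driver is None:
            failures.append(f"{driver_id}: not registered")
            continue
        try:
            return driver_id, await driver(capability_id, args)
        except DriverError as exc:
            failures.append(f"{driver_id}: {exc}")  # remember why, then fall through
    raise DriverError(f"no driver succeeded for {capability_id!r}: {failures}")


async def flaky_http(capability_id: str, args: dict[str, Any]) -> Any:
    raise DriverError("upstream timeout")


async def memory(capability_id: str, args: dict[str, Any]) -> Any:
    return [{"id": 1, "title": "write docs"}]


served_by, result = asyncio.run(
    invoke_with_fallback(
        {"tasks.list": ["http", "memory"]},
        {"http": flaky_http, "memory": memory},
        "tasks.list",
        {},
    )
)
```

Collecting the per-driver failure reasons, rather than discarding them, gives the audit trail something to record when a fallback was used.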

Reviewed changes

Copilot reviewed 46 out of 47 changed files in this pull request and generated 10 comments.

Show a summary per file
| File | Description |
| --- | --- |
| `.github/workflows/ci.yml` | CI matrix for lint/format/type/test/examples. |
| `AGENTS.md` | Repo conventions and security/quality guidelines for agents. |
| `CHANGELOG.md` | Project changelog scaffold. |
| `CONTRIBUTING.md` | Contributor workflow and quality bar. |
| `Makefile` | Local developer commands aligned with CI. |
| `README.md` | Project overview, architecture, and quickstart. |
| `docs/architecture.md` | High-level architecture and component diagram. |
| `docs/capabilities.md` | Guidance for capability naming and design. |
| `docs/context_firewall.md` | Firewall response modes, budgets, handles, redaction. |
| `docs/integrations.md` | Driver integration guidance (MCP/HTTP/custom). |
| `docs/security.md` | Threat model and security properties. |
| `examples/basic_cli.py` | End-to-end demo of request → token → invoke → expand → explain. |
| `examples/billing_demo.py` | Demo using deterministic billing dataset + budgets + expansion. |
| `examples/http_driver_demo.py` | Demo running a local HTTP server with HTTPDriver. |
| `pyproject.toml` | Packaging metadata + dependencies + ruff/mypy/pytest config. |
| `src/agent_kernel/__init__.py` | Public API exports and version. |
| `src/agent_kernel/drivers/__init__.py` | Driver subpackage exports. |
| `src/agent_kernel/drivers/base.py` | Driver protocol + execution context. |
| `src/agent_kernel/drivers/http.py` | Async HTTP execution driver based on httpx. |
| `src/agent_kernel/drivers/memory.py` | In-memory driver + deterministic billing dataset factory. |
| `src/agent_kernel/enums.py` | SafetyClass and SensitivityTag enums. |
| `src/agent_kernel/errors.py` | Custom exception hierarchy. |
| `src/agent_kernel/firewall/__init__.py` | Firewall subpackage exports. |
| `src/agent_kernel/firewall/budgets.py` | Firewall budgets dataclass. |
| `src/agent_kernel/firewall/redaction.py` | Regex + field-name based redaction utilities. |
| `src/agent_kernel/firewall/summarize.py` | Deterministic summarization heuristics. |
| `src/agent_kernel/firewall/transform.py` | Core RawResult → Frame transformer enforcing budgets/modes. |
| `src/agent_kernel/handles.py` | HandleStore with TTL + expand (pagination/filters/fields). |
| `src/agent_kernel/kernel.py` | Main orchestration: token verify → route → execute → firewall → trace. |
| `src/agent_kernel/models.py` | Core dataclasses: Capability, Principal, Frame, Handle, ActionTrace, etc. |
| `src/agent_kernel/policy.py` | DefaultPolicyEngine rules + constraint enforcement. |
| `src/agent_kernel/py.typed` | PEP 561 marker for typed package. |
| `src/agent_kernel/registry.py` | Capability registry + keyword-overlap search. |
| `src/agent_kernel/router.py` | Static routing from capability_id → ordered driver chain. |
| `src/agent_kernel/tokens.py` | CapabilityToken serialization + HMACTokenProvider signing/verify. |
| `src/agent_kernel/trace.py` | TraceStore for in-memory audit traces. |
| `tests/conftest.py` | Shared fixtures for kernel, principals, registry, drivers. |
| `tests/test_drivers.py` | Driver unit tests (InMemoryDriver + HTTPDriver). |
| `tests/test_firewall.py` | Firewall mode/budget/redaction behavior tests. |
| `tests/test_handles.py` | HandleStore TTL/eviction/expand behavior tests. |
| `tests/test_kernel.py` | Integration tests for full kernel flows + fallback + token scope. |
| `tests/test_models.py` | Dataclass construction and serialization tests. |
| `tests/test_policy.py` | DefaultPolicyEngine rule tests. |
| `tests/test_registry.py` | CapabilityRegistry registration/search tests. |
| `tests/test_router.py` | StaticRouter routing semantics tests. |
| `tests/test_tokens.py` | HMACTokenProvider issuance/verify/tamper/expiry tests. |
| `tests/test_trace.py` | TraceStore record/get/list tests. |


dgenio marked this pull request as ready for review on March 4, 2026 07:08
dgenio merged commit f50b245 into main on Mar 4, 2026
6 checks passed